A Comparison of Image Processing Techniques for Visual Speech Recognition Applications

Neural Information Processing Systems

We examine eight different techniques for developing visual representations in machine vision tasks. In particular we compare different versions of principal component and independent component analysis in combination with stepwise regression methods for variable selection. We found that local methods, based on the statistics of image patches, consistently outperformed global methods based on the statistics of entire images. This result is consistent with previous work on emotion and facial expression recognition. In addition, the use of a stepwise regression technique for selecting variables and regions of interest substantially boosted performance.


Gray, Michael S., Sejnowski, Terrence J., Movellan, Javier R.

These methods are compared by their performance on a visual speech recognition task. While the representations developed are specific to visual speech recognition, the methods themselves are general purpose and applicable to other tasks. Our focus is on low-level, data-driven methods based on the statistical properties of relatively untouched images, as opposed to approaches that work with contours or highly processed versions of the image. Padgett [8] and Bartlett [1] systematically studied statistical methods for developing representations on expression recognition tasks. They found that local wavelet-like representations consistently outperformed global representations, like eigenfaces. In this paper we also compare local versus global representations.
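
As a rough illustration of the local-versus-global distinction, the sketch below computes a principal component basis once from entire images and once from image patches. It is not the authors' implementation: the synthetic random images, the 12 x 12 image size, the non-overlapping 4 x 4 patches, the choice of five components, and the helper names `pca_basis` and `extract_patches` are all assumptions made for the example.

```python
import numpy as np


def pca_basis(X, k):
    """Return the top-k principal directions (as rows) of X (samples x features)."""
    Xc = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]


def extract_patches(images, size):
    """Cut every image into non-overlapping size x size patches, each flattened."""
    n, h, w = images.shape
    patches = [
        images[:, r:r + size, c:c + size].reshape(n, -1)
        for r in range(0, h - size + 1, size)
        for c in range(0, w - size + 1, size)
    ]
    return np.concatenate(patches, axis=0)


# Synthetic stand-in for a set of mouth-region images (assumption, not real data).
rng = np.random.default_rng(0)
images = rng.standard_normal((50, 12, 12))

# Global representation: statistics of entire images (eigenface-style basis).
global_basis = pca_basis(images.reshape(len(images), -1), k=5)

# Local representation: statistics of image patches pooled over all locations.
local_basis = pca_basis(extract_patches(images, size=4), k=5)

print(global_basis.shape, local_basis.shape)
```

Each global basis vector spans the whole image, while each local basis vector spans a single patch and would be applied at every patch location to produce the representation. On real images one would then compare recognition performance with the two sets of features, which this sketch does not attempt.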